53 research outputs found
Improving I/O Bandwidth for Data-Intensive Applications
High disk bandwidth in data-intensive applications is usually
achieved with expensive hardware solutions consisting of a large number
of disks. In this article we present our current work on software methods
for improving disk bandwidth in ColumnBM, a new storage system for
the MonetDB/X100 query execution engine. Two novel techniques are discussed:
superscalar compression for standalone queries and cooperative
scans for multi-query optimization.
Detecting genuine multipartite entanglement of pure states with bipartite correlations
Monogamy of bipartite correlations leads, for arbitrary pure multi-qubit
states, to simple conditions that can indicate various types of multipartite
entanglement by excluding the possibility of k-separability.
Micro Adaptivity in Vectorwise
Performance of query processing functions in a DBMS can be affected by many factors, including the hardware platform, data distributions, predicate parameters, compilation method, algorithmic variations and the interactions between these. Given that there are often different function implementations possible, there is a latent performance diversity which represents both a threat to performance robustness if ignored (as is usual now) and an opportunity to increase the performance if one could use the best performing implementation in each situation. Micro Adaptivity, proposed here, is a framework that keeps many alternative function implementations ("flavors") in a system. It uses a learning algorithm to choose the most promising flavor potentially at each function call, guided by the actual costs observed so far. We argue that Micro Adaptivity both increases performance robustness and saves development time spent in finding and tuning heuristics and cost model thresholds in query optimization. In this paper, we (i) characterize a number of factors that cause performance diversity between primitive flavors, (ii) describe an ε-greedy learning algorithm that casts the flavor selection into a multi-armed bandit problem, and (iii) describe the software framework for Micro Adaptivity that we implemented in the Vectorwise system. We provide micro-benchmarks, and an overall evaluation on TPC-H, showing consistent improvements.
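The flavor selection described in this abstract can be illustrated with a minimal sketch of an ε-greedy multi-armed bandit. This is not the Vectorwise implementation: the class name `FlavorSelector`, its methods, the default `epsilon` value and the use of wall-clock time as the cost are all assumptions made for illustration.

```python
import random
import time


class FlavorSelector:
    """Epsilon-greedy choice among alternative implementations ("flavors").

    Hypothetical helper for illustration; not an API of any real system.
    """

    def __init__(self, flavors, epsilon=0.1, rng=None):
        self.flavors = list(flavors)
        self.epsilon = epsilon                      # exploration probability (assumed value)
        self.rng = rng or random.Random()
        self.calls = [0] * len(self.flavors)        # how often each flavor was measured
        self.mean_cost = [0.0] * len(self.flavors)  # running average cost per flavor

    def choose(self):
        # Measure every flavor once before trusting the averages.
        for i, n in enumerate(self.calls):
            if n == 0:
                return i
        # Explore: with probability epsilon pick a random flavor.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.flavors))
        # Exploit: otherwise pick the flavor with the lowest observed mean cost.
        return min(range(len(self.flavors)), key=self.mean_cost.__getitem__)

    def record(self, i, cost):
        # Incremental update of the running mean for flavor i.
        self.calls[i] += 1
        self.mean_cost[i] += (cost - self.mean_cost[i]) / self.calls[i]

    def call(self, *args):
        # Run the chosen flavor and feed its measured cost back into the bandit.
        i = self.choose()
        start = time.perf_counter()
        result = self.flavors[i](*args)
        self.record(i, time.perf_counter() - start)
        return result


# Two interchangeable "flavors" of the same primitive (both sum a list).
selector = FlavorSelector([lambda xs: sum(xs), lambda xs: sum(x for x in xs)])
totals = [selector.call(list(range(1000))) for _ in range(50)]
```

Every call returns the same result regardless of the flavor chosen; only the cost differs, which is what the selector learns from.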
DSM vs. NSM: CPU Performance Tradeoffs in Block-Oriented Query Processing
Comparisons between the merits of row-wise storage (NSM)
and columnar storage (DSM) are typically made with respect
to the persistent storage layer of database systems. In
this paper, however, we focus on the CPU efficiency tradeoffs
of tuple representations inside the query execution engine,
while tuples flow through a processing pipeline. We
analyze the performance in the context of query engines using
so-called "block-oriented" processing --- a recently popularized
technique that can strongly improve the CPU efficiency.
With this high efficiency, the performance trade-offs
between NSM and DSM can have a decisive impact on the
query execution performance, as we demonstrate using both
microbenchmarks and TPC-H query 1. This means that
NSM-based database systems can sometimes benefit from
converting tuples into DSM on-the-fly, and vice versa.
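A minimal sketch of the on-the-fly conversion mentioned in this abstract, applied to one block of tuples. The function names, the column names `price` and `qty`, and the sample values are hypothetical. Plain Python lists do not reproduce the CPU effects the paper measures, so this only shows the two layouts and that both yield the same result.

```python
def nsm_to_dsm(rows, names):
    """Convert a block of row tuples (NSM) into one list per column (DSM)."""
    columns = list(zip(*rows)) if rows else [() for _ in names]
    return {name: list(col) for name, col in zip(names, columns)}


def dsm_to_nsm(block, names):
    """Convert a column-wise block (DSM) back into a list of row tuples (NSM)."""
    return list(zip(*(block[name] for name in names)))


names = ("price", "qty")                 # hypothetical schema
rows = [(10, 2), (20, 3), (5, 4)]        # NSM: one tuple per row

block = nsm_to_dsm(rows, names)          # DSM: {"price": [...], "qty": [...]}

# The same aggregate computed over each layout.
total_nsm = sum(price * qty for price, qty in rows)
total_dsm = sum(p * q for p, q in zip(block["price"], block["qty"]))
```

Here the conversion is done once per block, so its cost is spread over all tuples in that block.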
Plasminogen activator inhibitor-1 (PAI-1) and urokinase plasminogen activator (uPA) in sputum of allergic asthma patients.
Urokinase plasminogen activator (uPA) and its inhibitor (PAI-1) have been associated with asthma. The aim of this study was to evaluate the concentrations of uPA and PAI-1 in induced sputum of house dust mite allergic asthmatics (HDM-AAs). The study was performed on 19 HDM-AAs and 8 healthy nonatopic controls (HCs). Concentrations of uPA and PAI-1 were evaluated in induced sputum supernatants using ELISA. In HDM-AAs the median sputum concentrations of uPA (128 pg/ml; 95% CI 99 to 183 pg/ml) and PAI-1 (4063 pg/ml; 95% CI 3319 to 4784 pg/ml) were significantly greater than in HCs (17 pg/ml; 95% CI 12 to 32 pg/ml;
Cooperative scans
Data mining, information retrieval and other application areas exhibit a query load with multiple concurrent queries touching a large fraction of a relation. This leads to individual query plans based on a table scan or large index scan. The implementation of this access path in most database systems is straightforward. The Scan operator issues next page requests to the buffer manager without concern for the system state. Conversely, the buffer manager is not aware of the work ahead and it focuses on keeping the most-recently-used pages in the buffer pool. This paper introduces cooperative scans -- a new algorithm, based on a better sharing of knowledge and responsibility between the Scan operator and the buffer manager, which significantly improves performance of concurrent scan queries. In this approach, queries share the buffer content, and progress of the scans is optimized by the buffer manager by minimizing the number of disk transfers in light of the total workload ahead. The experimental results are based on a simulation of the various disk-access scheduling policies, and implementation of the cooperative scans within PostgreSQL and MonetDB/X100. These real-life experiments show that with little effort the performance of existing database systems on concurrent scan queries can be strongly improved.
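The idea of a buffer manager that knows the work ahead can be sketched with a toy simulation. This is a strong simplification and not the scheduling policy from the paper: it assumes each query may consume its chunks in any order, that every interested query consumes a chunk as soon as it is loaded, and that the next chunk to load is simply the one wanted by the most queries. The function names and the sample workload are hypothetical.

```python
def cooperative_loads(needs):
    """Return the order in which chunks are loaded when queries share loads.

    needs maps a query name to the set of chunk ids it still has to read.
    """
    remaining = {q: set(chunks) for q, chunks in needs.items()}
    order = []
    while any(remaining.values()):
        # Count how many queries still want each chunk (the "work ahead").
        demand = {}
        for chunks in remaining.values():
            for c in chunks:
                demand[c] = demand.get(c, 0) + 1
        # Assumed policy: load the most-wanted chunk, lowest id on ties.
        chunk = min(demand, key=lambda c: (-demand[c], c))
        order.append(chunk)
        # Every interested query consumes the chunk from the shared buffer.
        for chunks in remaining.values():
            chunks.discard(chunk)
    return order


def independent_loads(needs):
    """Worst-case load count when each scan reads its chunks on its own."""
    return sum(len(chunks) for chunks in needs.values())


# Hypothetical workload: three concurrent scans over overlapping chunk ranges.
needs = {"q1": {0, 1, 2, 3}, "q2": {2, 3, 4}, "q3": {3}}
order = cooperative_loads(needs)
```

Under these assumptions the number of shared loads equals the number of distinct chunks, whatever the order. The ordering only starts to matter once limited buffer capacity and differing query speeds are modeled, which this sketch leaves out.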